Use cachetools's LRUCache to cache manifest list #1187
Conversation
```diff
@@ -620,6 +623,13 @@ def fetch_manifest_entry(self, io: FileIO, discard_deleted: bool = True) -> List
+@cached(cache=LRUCache(maxsize=128), key=lambda io, manifest_list: hashkey(manifest_list))
```
Did the use of cachetools `LRUCache`, instead of `functools.lru_cache`, solve the issues with the FileSystem on M1 Macs we were observing before?
Yep! The difference is that `functools.lru_cache` uses all of the function's args as the cache key, including the `io` arg. With `cachetools`, I can specify which argument to use as the cache key; in this case, only `manifest_list` is used:

`key=lambda io, manifest_list: hashkey(manifest_list)`
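A minimal sketch of the difference, with illustrative function names, bodies, and cache sizes rather than the actual PyIceberg code:

```python
from functools import lru_cache

from cachetools import LRUCache, cached
from cachetools.keys import hashkey


@lru_cache(maxsize=128)
def fetch_with_functools(io, manifest_list):
    # functools.lru_cache hashes every argument, so each distinct `io`
    # object produces a separate cache entry and is held by the cache.
    ...


@cached(cache=LRUCache(maxsize=128), key=lambda io, manifest_list: hashkey(manifest_list))
def fetch_with_cachetools(io, manifest_list):
    # cachetools lets the key function ignore `io`; only `manifest_list`
    # decides cache hits, so different io objects share the same entry.
    ...
```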
That's so clever - so the issue was with the function arguments being 'held up' by the cache preventing them from being GCed 🤯 🧠
This is next level @kevinjqliu 💯
That's my best theory, haha. Regardless of the underlying cause, we should not be using `io` as the cache key!
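For illustration, here is a minimal, self-contained sketch (not PyIceberg code) of the "held up by the cache" behaviour, using a hypothetical stand-in for the io object:

```python
import functools
import gc
import weakref


class FakeFileIO:
    """Hypothetical stand-in for the real io object; hashable, so lru_cache accepts it."""


@functools.lru_cache(maxsize=128)
def fetch(io, manifest_list):
    return manifest_list.upper()


io = FakeFileIO()
ref = weakref.ref(io)
fetch(io, "s3://bucket/metadata/manifest-list.avro")

del io
gc.collect()
# The lru_cache key tuple still holds a strong reference to the io argument,
# so it is not garbage collected while the entry stays in the cache:
print(ref() is not None)  # True
```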
Force-pushed from 13fe7d6 to ade0fc5
When is the 0.8.0 release planned? Would it be possible to ship this fix in a 0.7.2 hotfix release? This solves the #1162 bug, which is blocking usage unless there is a workaround. Thanks!
Closes #595, Reverts #787, Closes #1162

Reimplement the manifest list cache using the `cachetools` library. This keeps the manifest list cache as a global cache and adds the ability to use only `manifest_list` as the cache key.

The cached result is stored as a `Tuple` instead of a `List` to prevent modification.

Move the `_manifests` function from `snapshots.py` to `manifest.py`, since it is a global manifest cache and not specific to any one snapshot.
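A rough sketch of what the relocated, cached `_manifests` function looks like; the helper names and the cache size here follow my reading of the change and may differ from the merged code:

```python
from typing import Tuple

from cachetools import LRUCache, cached
from cachetools.keys import hashkey

from pyiceberg.io import FileIO
from pyiceberg.manifest import ManifestFile, read_manifest_list


@cached(cache=LRUCache(maxsize=128), key=lambda io, manifest_list: hashkey(manifest_list))
def _manifests(io: FileIO, manifest_list: str) -> Tuple[ManifestFile, ...]:
    """Read the manifest list once per manifest_list path; the result is shared globally."""
    # A tuple (rather than a list) keeps the cached value effectively immutable.
    return tuple(read_manifest_list(io.new_input(manifest_list)))
```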